Dataset statistics
| Number of variables | 10 |
|---|---|
| Number of observations | 20640 |
| Missing cells | 207 |
| Missing cells (%) | 0.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 2.7 MiB |
| Average record size in memory | 137.1 B |
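The percentages and the per-record size in this table follow from simple arithmetic on the counts. The sketch below is illustrative only: it assumes the missing-cell percentage is taken over all cells (observations × variables) and that 1 MiB is 1024 × 1024 bytes. The variable names are not from the report, and the per-record figure is approximate because the total size is reported rounded to 2.7 MiB.

```python
# Figures copied from the dataset statistics table.
n_variables = 10
n_observations = 20640
missing_cells = 207
total_size_mib = 2.7  # rounded value as printed in the report

# Missing cells are measured against every cell in the table (assumption).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate bytes per record, assuming 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```

The same 207 missing cells amount to 1.0% when measured against the 20640 rows of a single column, which is why the total_bedrooms warning reports a larger percentage than the dataset-level figure.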
Variable types
| NUM | 9 |
|---|---|
| CAT | 1 |
Reproduction
| Analysis started | 2020-06-03 07:22:48.937533 |
|---|---|
| Analysis finished | 2020-06-03 07:23:09.430083 |
| Duration | 20.49 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
| latitude is highly correlated with longitude | High correlation |
| longitude is highly correlated with latitude | High correlation |
| total_bedrooms is highly correlated with total_rooms and 1 other fields | High correlation |
| total_rooms is highly correlated with total_bedrooms and 1 other fields | High correlation |
| households is highly correlated with total_rooms and 2 other fields | High correlation |
| population is highly correlated with households | High correlation |
| total_bedrooms has 207 (1.0%) missing values | Missing |
longitude
| Distinct count | 844 |
|---|---|
| Unique (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -119.56970445736432 |
|---|---|
| Minimum | -124.35 |
| Maximum | -114.31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | -124.35 |
|---|---|
| 5-th percentile | -122.47 |
| Q1 | -121.8 |
| median | -118.49 |
| Q3 | -118.01 |
| 95-th percentile | -117.08 |
| Maximum | -114.31 |
| Range | 10.04 |
| Interquartile range (IQR) | 3.79 |
Descriptive statistics
| Standard deviation | 2.003531724 |
|---|---|
| Coefficient of variation (CV) | -0.01675618195 |
| Kurtosis | -1.330152366 |
| Mean | -119.5697045 |
| Median Absolute Deviation (MAD) | 1.28 |
| Skewness | -0.297801208 |
| Sum | -2467918.7 |
| Variance | 4.014139367 |
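Several of the derived figures in the tables above can be checked against each other. The sketch below is an illustration, not the report's own code. It assumes the usual definitions (range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared), and the variable names are hypothetical.

```python
# Values copied from the quantile and descriptive statistics tables above.
minimum, maximum = -124.35, -114.31
q1, q3 = -121.8, -118.01
mean = -119.5697045
std = 2.003531724

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(round(value_range, 2), round(iqr, 2))
print(round(cv, 8), round(variance, 6))
```

Under these definitions the CV is negative only because the mean is negative. The standard deviation itself is always non-negative.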
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| -118.31 | 162 | 0.8% | |
| -118.3 | 160 | 0.8% | |
| -118.29 | 148 | 0.7% | |
| -118.27 | 144 | 0.7% | |
| -118.32 | 142 | 0.7% | |
| -118.28 | 141 | 0.7% | |
| -118.35 | 140 | 0.7% | |
| -118.36 | 138 | 0.7% | |
| -118.19 | 135 | 0.7% | |
| -118.25 | 128 | 0.6% | |
| -118.37 | 128 | 0.6% | |
| -118.2 | 126 | 0.6% | |
| -118.14 | 125 | 0.6% | |
| -118.26 | 121 | 0.6% | |
| -118.13 | 121 | 0.6% | |
| -118.18 | 120 | 0.6% | |
| -118.34 | 119 | 0.6% | |
| -118.21 | 118 | 0.6% | |
| -118.15 | 116 | 0.6% | |
| -118.12 | 112 | 0.5% | |
| -118.1 | 109 | 0.5% | |
| -118.38 | 107 | 0.5% | |
| -118.17 | 106 | 0.5% | |
| -118.43 | 106 | 0.5% | |
| -118.16 | 103 | 0.5% | |
| Other values (819) | 17465 | 84.6% |
| Value | Count | Frequency (%) | |
| -124.35 | 1 | < 0.1% | |
| -124.3 | 2 | < 0.1% | |
| -124.27 | 1 | < 0.1% | |
| -124.26 | 1 | < 0.1% | |
| -124.25 | 1 | < 0.1% | |
| -124.23 | 3 | < 0.1% | |
| -124.22 | 1 | < 0.1% | |
| -124.21 | 3 | < 0.1% | |
| -124.19 | 4 | < 0.1% | |
| -124.18 | 6 | < 0.1% |
| Value | Count | Frequency (%) | |
| -114.31 | 1 | < 0.1% | |
| -114.47 | 1 | < 0.1% | |
| -114.49 | 1 | < 0.1% | |
| -114.55 | 1 | < 0.1% | |
| -114.56 | 1 | < 0.1% | |
| -114.57 | 3 | < 0.1% | |
| -114.58 | 2 | < 0.1% | |
| -114.59 | 2 | < 0.1% | |
| -114.6 | 3 | < 0.1% | |
| -114.61 | 3 | < 0.1% |
latitude
| Distinct count | 862 |
|---|---|
| Unique (%) | 4.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35.63186143410853 |
|---|---|
| Minimum | 32.54 |
| Maximum | 41.95 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 32.54 |
|---|---|
| 5-th percentile | 32.82 |
| Q1 | 33.93 |
| median | 34.26 |
| Q3 | 37.71 |
| 95-th percentile | 38.96 |
| Maximum | 41.95 |
| Range | 9.41 |
| Interquartile range (IQR) | 3.78 |
Descriptive statistics
| Standard deviation | 2.135952397 |
|---|---|
| Coefficient of variation (CV) | 0.05994501302 |
| Kurtosis | -1.117759781 |
| Mean | 35.63186143 |
| Median Absolute Deviation (MAD) | 1.23 |
| Skewness | 0.4659530037 |
| Sum | 735441.62 |
| Variance | 4.562292644 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 34.06 | 244 | 1.2% | |
| 34.05 | 236 | 1.1% | |
| 34.08 | 234 | 1.1% | |
| 34.07 | 231 | 1.1% | |
| 34.04 | 221 | 1.1% | |
| 34.09 | 212 | 1.0% | |
| 34.02 | 208 | 1.0% | |
| 34.1 | 203 | 1.0% | |
| 34.03 | 193 | 0.9% | |
| 33.93 | 181 | 0.9% | |
| 33.94 | 175 | 0.8% | |
| 33.97 | 172 | 0.8% | |
| 33.99 | 168 | 0.8% | |
| 33.88 | 164 | 0.8% | |
| 34.11 | 162 | 0.8% | |
| 33.98 | 162 | 0.8% | |
| 34.16 | 159 | 0.8% | |
| 34.12 | 158 | 0.8% | |
| 34.15 | 157 | 0.8% | |
| 34.01 | 156 | 0.8% | |
| 33.89 | 154 | 0.7% | |
| 34.17 | 154 | 0.7% | |
| 34.14 | 152 | 0.7% | |
| 33.9 | 152 | 0.7% | |
| 34 | 152 | 0.7% | |
| Other values (837) | 16080 | 77.9% |
| Value | Count | Frequency (%) | |
| 32.54 | 1 | < 0.1% | |
| 32.55 | 3 | < 0.1% | |
| 32.56 | 10 | < 0.1% | |
| 32.57 | 18 | 0.1% | |
| 32.58 | 26 | 0.1% | |
| 32.59 | 11 | 0.1% | |
| 32.6 | 9 | < 0.1% | |
| 32.61 | 14 | 0.1% | |
| 32.62 | 13 | 0.1% | |
| 32.63 | 18 | 0.1% |
| Value | Count | Frequency (%) | |
| 41.95 | 2 | < 0.1% | |
| 41.92 | 1 | < 0.1% | |
| 41.88 | 1 | < 0.1% | |
| 41.86 | 3 | < 0.1% | |
| 41.84 | 1 | < 0.1% | |
| 41.82 | 1 | < 0.1% | |
| 41.81 | 2 | < 0.1% | |
| 41.8 | 3 | < 0.1% | |
| 41.79 | 1 | < 0.1% | |
| 41.78 | 3 | < 0.1% |
housing_median_age
Real number (ℝ≥0)
| Distinct count | 52 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 28.639486434108527 |
|---|---|
| Minimum | 1.0 |
| Maximum | 52.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 8 |
| Q1 | 18 |
| median | 29 |
| Q3 | 37 |
| 95-th percentile | 52 |
| Maximum | 52 |
| Range | 51 |
| Interquartile range (IQR) | 19 |
Descriptive statistics
| Standard deviation | 12.58555761 |
|---|---|
| Coefficient of variation (CV) | 0.4394477408 |
| Kurtosis | -0.8006288536 |
| Mean | 28.63948643 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 0.0603306376 |
| Sum | 591119 |
| Variance | 158.3962604 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 52 | 1273 | 6.2% | |
| 36 | 862 | 4.2% | |
| 35 | 824 | 4.0% | |
| 16 | 771 | 3.7% | |
| 17 | 698 | 3.4% | |
| 34 | 689 | 3.3% | |
| 26 | 619 | 3.0% | |
| 33 | 615 | 3.0% | |
| 18 | 570 | 2.8% | |
| 25 | 566 | 2.7% | |
| 32 | 565 | 2.7% | |
| 37 | 537 | 2.6% | |
| 15 | 512 | 2.5% | |
| 19 | 502 | 2.4% | |
| 27 | 488 | 2.4% | |
| 24 | 478 | 2.3% | |
| 30 | 476 | 2.3% | |
| 28 | 471 | 2.3% | |
| 20 | 465 | 2.3% | |
| 29 | 461 | 2.2% | |
| 31 | 458 | 2.2% | |
| 23 | 448 | 2.2% | |
| 21 | 446 | 2.2% | |
| 14 | 412 | 2.0% | |
| 22 | 399 | 1.9% | |
| Other values (27) | 6035 | 29.2% |
| Value | Count | Frequency (%) | |
| 1 | 4 | < 0.1% | |
| 2 | 58 | 0.3% | |
| 3 | 62 | 0.3% | |
| 4 | 191 | 0.9% | |
| 5 | 244 | 1.2% | |
| 6 | 160 | 0.8% | |
| 7 | 175 | 0.8% | |
| 8 | 206 | 1.0% | |
| 9 | 205 | 1.0% | |
| 10 | 264 | 1.3% |
| Value | Count | Frequency (%) | |
| 52 | 1273 | 6.2% | |
| 51 | 48 | 0.2% | |
| 50 | 136 | 0.7% | |
| 49 | 134 | 0.6% | |
| 48 | 177 | 0.9% | |
| 47 | 198 | 1.0% | |
| 46 | 245 | 1.2% | |
| 45 | 294 | 1.4% | |
| 44 | 356 | 1.7% | |
| 43 | 353 | 1.7% |
total_rooms
| Distinct count | 5926 |
|---|---|
| Unique (%) | 28.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2635.7630813953488 |
|---|---|
| Minimum | 2.0 |
| Maximum | 39320.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 620.95 |
| Q1 | 1447.75 |
| median | 2127 |
| Q3 | 3148 |
| 95-th percentile | 6213.2 |
| Maximum | 39320 |
| Range | 39318 |
| Interquartile range (IQR) | 1700.25 |
Descriptive statistics
| Standard deviation | 2181.615252 |
|---|---|
| Coefficient of variation (CV) | 0.8276977802 |
| Kurtosis | 32.630927 |
| Mean | 2635.763081 |
| Median Absolute Deviation (MAD) | 797 |
| Skewness | 4.147343451 |
| Sum | 54402150 |
| Variance | 4759445.106 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 1527 | 18 | 0.1% | |
| 1613 | 17 | 0.1% | |
| 1582 | 17 | 0.1% | |
| 2127 | 16 | 0.1% | |
| 1703 | 15 | 0.1% | |
| 1471 | 15 | 0.1% | |
| 2053 | 15 | 0.1% | |
| 1722 | 15 | 0.1% | |
| 1607 | 15 | 0.1% | |
| 1717 | 15 | 0.1% | |
| 1787 | 14 | 0.1% | |
| 1705 | 14 | 0.1% | |
| 1743 | 14 | 0.1% | |
| 1650 | 14 | 0.1% | |
| 1880 | 14 | 0.1% | |
| 1731 | 14 | 0.1% | |
| 1745 | 14 | 0.1% | |
| 1724 | 14 | 0.1% | |
| 1562 | 14 | 0.1% | |
| 1808 | 13 | 0.1% | |
| 1999 | 13 | 0.1% | |
| 1551 | 13 | 0.1% | |
| 1748 | 13 | 0.1% | |
| 1649 | 13 | 0.1% | |
| 1701 | 13 | 0.1% | |
| Other values (5901) | 20278 | 98.2% |
| Value | Count | Frequency (%) | |
| 2 | 1 | < 0.1% | |
| 6 | 1 | < 0.1% | |
| 8 | 1 | < 0.1% | |
| 11 | 1 | < 0.1% | |
| 12 | 1 | < 0.1% | |
| 15 | 2 | < 0.1% | |
| 16 | 1 | < 0.1% | |
| 18 | 4 | < 0.1% | |
| 19 | 2 | < 0.1% | |
| 20 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 39320 | 1 | < 0.1% | |
| 37937 | 1 | < 0.1% | |
| 32627 | 1 | < 0.1% | |
| 32054 | 1 | < 0.1% | |
| 30450 | 1 | < 0.1% | |
| 30405 | 1 | < 0.1% | |
| 30401 | 1 | < 0.1% | |
| 28258 | 1 | < 0.1% | |
| 27870 | 1 | < 0.1% | |
| 27700 | 1 | < 0.1% |
total_bedrooms
| Distinct count | 1923 |
|---|---|
| Unique (%) | 9.4% |
| Missing | 207 |
| Missing (%) | 1.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 537.8705525375618 |
|---|---|
| Minimum | 1.0 |
| Maximum | 6445.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 137 |
| Q1 | 296 |
| median | 435 |
| Q3 | 647 |
| 95-th percentile | 1275.4 |
| Maximum | 6445 |
| Range | 6444 |
| Interquartile range (IQR) | 351 |
Descriptive statistics
| Standard deviation | 421.3850701 |
|---|---|
| Coefficient of variation (CV) | 0.7834321252 |
| Kurtosis | 21.98557506 |
| Mean | 537.8705525 |
| Median Absolute Deviation (MAD) | 162 |
| Skewness | 3.459546332 |
| Sum | 10990309 |
| Variance | 177565.3773 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 280 | 55 | 0.3% | |
| 331 | 51 | 0.2% | |
| 345 | 50 | 0.2% | |
| 393 | 49 | 0.2% | |
| 343 | 49 | 0.2% | |
| 394 | 48 | 0.2% | |
| 328 | 48 | 0.2% | |
| 348 | 48 | 0.2% | |
| 272 | 47 | 0.2% | |
| 309 | 47 | 0.2% | |
| 295 | 46 | 0.2% | |
| 314 | 46 | 0.2% | |
| 322 | 46 | 0.2% | |
| 399 | 46 | 0.2% | |
| 317 | 46 | 0.2% | |
| 284 | 45 | 0.2% | |
| 388 | 45 | 0.2% | |
| 290 | 45 | 0.2% | |
| 291 | 45 | 0.2% | |
| 346 | 45 | 0.2% | |
| 287 | 45 | 0.2% | |
| 340 | 45 | 0.2% | |
| 313 | 45 | 0.2% | |
| 269 | 44 | 0.2% | |
| 460 | 44 | 0.2% | |
| Other values (1898) | 19263 | 93.3% | |
| (Missing) | 207 | 1.0% |
| Value | Count | Frequency (%) | |
| 1 | 1 | < 0.1% | |
| 2 | 2 | < 0.1% | |
| 3 | 5 | < 0.1% | |
| 4 | 7 | < 0.1% | |
| 5 | 6 | < 0.1% | |
| 6 | 5 | < 0.1% | |
| 7 | 6 | < 0.1% | |
| 8 | 8 | < 0.1% | |
| 9 | 7 | < 0.1% | |
| 10 | 8 | < 0.1% |
| Value | Count | Frequency (%) | |
| 6445 | 1 | < 0.1% | |
| 6210 | 1 | < 0.1% | |
| 5471 | 1 | < 0.1% | |
| 5419 | 1 | < 0.1% | |
| 5290 | 1 | < 0.1% | |
| 5033 | 1 | < 0.1% | |
| 5027 | 1 | < 0.1% | |
| 4957 | 1 | < 0.1% | |
| 4952 | 1 | < 0.1% | |
| 4819 | 1 | < 0.1% |
population
| Distinct count | 3888 |
|---|---|
| Unique (%) | 18.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1425.4767441860465 |
|---|---|
| Minimum | 3.0 |
| Maximum | 35682.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 3 |
|---|---|
| 5-th percentile | 348 |
| Q1 | 787 |
| median | 1166 |
| Q3 | 1725 |
| 95-th percentile | 3288 |
| Maximum | 35682 |
| Range | 35679 |
| Interquartile range (IQR) | 938 |
Descriptive statistics
| Standard deviation | 1132.462122 |
|---|---|
| Coefficient of variation (CV) | 0.7944444737 |
| Kurtosis | 73.55311639 |
| Mean | 1425.476744 |
| Median Absolute Deviation (MAD) | 440 |
| Skewness | 4.935858227 |
| Sum | 29421840 |
| Variance | 1282470.457 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 891 | 25 | 0.1% | |
| 761 | 24 | 0.1% | |
| 1227 | 24 | 0.1% | |
| 850 | 24 | 0.1% | |
| 1052 | 24 | 0.1% | |
| 825 | 23 | 0.1% | |
| 999 | 22 | 0.1% | |
| 782 | 22 | 0.1% | |
| 1005 | 22 | 0.1% | |
| 781 | 21 | 0.1% | |
| 1098 | 21 | 0.1% | |
| 753 | 21 | 0.1% | |
| 872 | 21 | 0.1% | |
| 1056 | 20 | 0.1% | |
| 1158 | 20 | 0.1% | |
| 899 | 20 | 0.1% | |
| 837 | 20 | 0.1% | |
| 804 | 20 | 0.1% | |
| 1011 | 20 | 0.1% | |
| 926 | 20 | 0.1% | |
| 1155 | 20 | 0.1% | |
| 1203 | 20 | 0.1% | |
| 1047 | 20 | 0.1% | |
| 986 | 20 | 0.1% | |
| 861 | 20 | 0.1% | |
| Other values (3863) | 20106 | 97.4% |
| Value | Count | Frequency (%) | |
| 3 | 1 | < 0.1% | |
| 5 | 1 | < 0.1% | |
| 6 | 1 | < 0.1% | |
| 8 | 4 | < 0.1% | |
| 9 | 2 | < 0.1% | |
| 11 | 1 | < 0.1% | |
| 13 | 4 | < 0.1% | |
| 14 | 3 | < 0.1% | |
| 15 | 2 | < 0.1% | |
| 17 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 35682 | 1 | < 0.1% | |
| 28566 | 1 | < 0.1% | |
| 16305 | 1 | < 0.1% | |
| 16122 | 1 | < 0.1% | |
| 15507 | 1 | < 0.1% | |
| 15037 | 1 | < 0.1% | |
| 13251 | 1 | < 0.1% | |
| 12873 | 1 | < 0.1% | |
| 12427 | 1 | < 0.1% | |
| 12203 | 1 | < 0.1% |
households
| Distinct count | 1815 |
|---|---|
| Unique (%) | 8.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 499.5396802325581 |
|---|---|
| Minimum | 1.0 |
| Maximum | 6082.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 125 |
| Q1 | 280 |
| median | 409 |
| Q3 | 605 |
| 95-th percentile | 1162 |
| Maximum | 6082 |
| Range | 6081 |
| Interquartile range (IQR) | 325 |
Descriptive statistics
| Standard deviation | 382.3297528 |
|---|---|
| Coefficient of variation (CV) | 0.7653641301 |
| Kurtosis | 22.05798806 |
| Mean | 499.5396802 |
| Median Absolute Deviation (MAD) | 151 |
| Skewness | 3.410437712 |
| Sum | 10310499 |
| Variance | 146176.0399 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 306 | 57 | 0.3% | |
| 386 | 56 | 0.3% | |
| 335 | 56 | 0.3% | |
| 282 | 55 | 0.3% | |
| 429 | 54 | 0.3% | |
| 375 | 53 | 0.3% | |
| 284 | 51 | 0.2% | |
| 297 | 51 | 0.2% | |
| 362 | 50 | 0.2% | |
| 380 | 50 | 0.2% | |
| 278 | 50 | 0.2% | |
| 340 | 50 | 0.2% | |
| 316 | 49 | 0.2% | |
| 329 | 49 | 0.2% | |
| 319 | 49 | 0.2% | |
| 330 | 49 | 0.2% | |
| 377 | 48 | 0.2% | |
| 309 | 48 | 0.2% | |
| 426 | 48 | 0.2% | |
| 341 | 48 | 0.2% | |
| 357 | 47 | 0.2% | |
| 352 | 46 | 0.2% | |
| 363 | 46 | 0.2% | |
| 410 | 46 | 0.2% | |
| 269 | 46 | 0.2% | |
| Other values (1790) | 19388 | 93.9% |
| Value | Count | Frequency (%) | |
| 1 | 1 | < 0.1% | |
| 2 | 3 | < 0.1% | |
| 3 | 4 | < 0.1% | |
| 4 | 4 | < 0.1% | |
| 5 | 7 | < 0.1% | |
| 6 | 5 | < 0.1% | |
| 7 | 10 | < 0.1% | |
| 8 | 8 | < 0.1% | |
| 9 | 9 | < 0.1% | |
| 10 | 7 | < 0.1% |
| Value | Count | Frequency (%) | |
| 6082 | 1 | < 0.1% | |
| 5358 | 1 | < 0.1% | |
| 5189 | 1 | < 0.1% | |
| 5050 | 1 | < 0.1% | |
| 4930 | 1 | < 0.1% | |
| 4855 | 1 | < 0.1% | |
| 4769 | 1 | < 0.1% | |
| 4616 | 1 | < 0.1% | |
| 4490 | 1 | < 0.1% | |
| 4372 | 1 | < 0.1% |
median_income
Real number (ℝ≥0)
| Distinct count | 12928 |
|---|---|
| Unique (%) | 62.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.8706710029069766 |
|---|---|
| Minimum | 0.4999 |
| Maximum | 15.0001 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 0.4999 |
|---|---|
| 5-th percentile | 1.60057 |
| Q1 | 2.5634 |
| median | 3.5348 |
| Q3 | 4.74325 |
| 95-th percentile | 7.300305 |
| Maximum | 15.0001 |
| Range | 14.5002 |
| Interquartile range (IQR) | 2.17985 |
Descriptive statistics
| Standard deviation | 1.899821718 |
|---|---|
| Coefficient of variation (CV) | 0.4908249026 |
| Kurtosis | 4.952524102 |
| Mean | 3.870671003 |
| Median Absolute Deviation (MAD) | 1.0642 |
| Skewness | 1.646656702 |
| Sum | 79890.6495 |
| Variance | 3.60932256 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 3.125 | 49 | 0.2% | |
| 15.0001 | 49 | 0.2% | |
| 2.875 | 46 | 0.2% | |
| 4.125 | 44 | 0.2% | |
| 2.625 | 44 | 0.2% | |
| 3.875 | 41 | 0.2% | |
| 3 | 38 | 0.2% | |
| 3.375 | 38 | 0.2% | |
| 3.625 | 37 | 0.2% | |
| 4 | 37 | 0.2% | |
| 4.375 | 35 | 0.2% | |
| 2.125 | 33 | 0.2% | |
| 2.375 | 32 | 0.2% | |
| 4.625 | 31 | 0.2% | |
| 3.5 | 30 | 0.1% | |
| 3.25 | 29 | 0.1% | |
| 3.75 | 29 | 0.1% | |
| 4.875 | 29 | 0.1% | |
| 1.625 | 29 | 0.1% | |
| 2.25 | 29 | 0.1% | |
| 4.25 | 28 | 0.1% | |
| 2.5 | 28 | 0.1% | |
| 3.6875 | 26 | 0.1% | |
| 2.75 | 25 | 0.1% | |
| 4.5 | 24 | 0.1% | |
| Other values (12903) | 19780 | 95.8% |
| Value | Count | Frequency (%) | |
| 0.4999 | 12 | 0.1% | |
| 0.536 | 10 | < 0.1% | |
| 0.5495 | 1 | < 0.1% | |
| 0.6433 | 1 | < 0.1% | |
| 0.6775 | 1 | < 0.1% | |
| 0.6825 | 1 | < 0.1% | |
| 0.6831 | 1 | < 0.1% | |
| 0.696 | 1 | < 0.1% | |
| 0.6991 | 1 | < 0.1% | |
| 0.7007 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 15.0001 | 49 | 0.2% | |
| 15 | 2 | < 0.1% | |
| 14.9009 | 1 | < 0.1% | |
| 14.5833 | 1 | < 0.1% | |
| 14.4219 | 1 | < 0.1% | |
| 14.4113 | 1 | < 0.1% | |
| 14.2959 | 1 | < 0.1% | |
| 14.2867 | 1 | < 0.1% | |
| 13.947 | 1 | < 0.1% | |
| 13.8556 | 1 | < 0.1% |
median_house_value
Real number (ℝ≥0)
| Distinct count | 3842 |
|---|---|
| Unique (%) | 18.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 206855.81690891474 |
|---|---|
| Minimum | 14999.0 |
| Maximum | 500001.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 14999 |
|---|---|
| 5-th percentile | 66200 |
| Q1 | 119600 |
| median | 179700 |
| Q3 | 264725 |
| 95-th percentile | 489810 |
| Maximum | 500001 |
| Range | 485002 |
| Interquartile range (IQR) | 145125 |
Descriptive statistics
| Standard deviation | 115395.6159 |
|---|---|
| Coefficient of variation (CV) | 0.55785531 |
| Kurtosis | 0.3278702429 |
| Mean | 206855.8169 |
| Median Absolute Deviation (MAD) | 68400 |
| Skewness | 0.9777632739 |
| Sum | 4269504061 |
| Variance | 1.331614816e+10 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 500001 | 965 | 4.7% | |
| 137500 | 122 | 0.6% | |
| 162500 | 117 | 0.6% | |
| 112500 | 103 | 0.5% | |
| 187500 | 93 | 0.5% | |
| 225000 | 92 | 0.4% | |
| 350000 | 79 | 0.4% | |
| 87500 | 78 | 0.4% | |
| 275000 | 65 | 0.3% | |
| 150000 | 64 | 0.3% | |
| 175000 | 63 | 0.3% | |
| 100000 | 62 | 0.3% | |
| 125000 | 56 | 0.3% | |
| 67500 | 55 | 0.3% | |
| 250000 | 47 | 0.2% | |
| 200000 | 46 | 0.2% | |
| 118800 | 39 | 0.2% | |
| 450000 | 37 | 0.2% | |
| 156300 | 35 | 0.2% | |
| 212500 | 33 | 0.2% | |
| 193800 | 31 | 0.2% | |
| 181300 | 31 | 0.2% | |
| 300000 | 30 | 0.1% | |
| 75000 | 30 | 0.1% | |
| 81300 | 29 | 0.1% | |
| Other values (3817) | 18238 | 88.4% |
| Value | Count | Frequency (%) | |
| 14999 | 4 | < 0.1% | |
| 17500 | 1 | < 0.1% | |
| 22500 | 4 | < 0.1% | |
| 25000 | 1 | < 0.1% | |
| 26600 | 1 | < 0.1% | |
| 26900 | 1 | < 0.1% | |
| 27500 | 1 | < 0.1% | |
| 28300 | 1 | < 0.1% | |
| 30000 | 2 | < 0.1% | |
| 32500 | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 500001 | 965 | 4.7% | |
| 500000 | 27 | 0.1% | |
| 499100 | 1 | < 0.1% | |
| 499000 | 1 | < 0.1% | |
| 498800 | 1 | < 0.1% | |
| 498700 | 1 | < 0.1% | |
| 498600 | 1 | < 0.1% | |
| 498400 | 1 | < 0.1% | |
| 497600 | 1 | < 0.1% | |
| 497400 | 1 | < 0.1% |
ocean_proximity
Categorical
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 161.4 KiB |
| Value | Count | Frequency (%) | |
| <1H OCEAN | 9136 | 44.3% | |
| INLAND | 6551 | 31.7% | |
| NEAR OCEAN | 2658 | 12.9% | |
| NEAR BAY | 2290 | 11.1% | |
| ISLAND | 5 | < 0.1% |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 8.064922481 |
| Min length | 6 |
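The length statistics above can be reproduced from the category counts in the ocean_proximity frequency table. This is a minimal sketch using only the Python standard library. It assumes each label's length is simply its character count, spaces included, and the variable names are illustrative.

```python
from statistics import mean, median

# Category counts copied from the ocean_proximity frequency table.
counts = {
    "<1H OCEAN": 9136,
    "INLAND": 6551,
    "NEAR OCEAN": 2658,
    "NEAR BAY": 2290,
    "ISLAND": 5,
}

# One string length per observation.
lengths = [len(label) for label, n in counts.items() for _ in range(n)]

total_chars = sum(lengths)
print("max:", max(lengths), "min:", min(lengths), "median:", median(lengths))
print("total characters:", total_chars)
print("mean length:", round(mean(lengths), 9))
```

The total character count computed this way is the same total that the character-level tables for this variable break down by character, category, script and block.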
Most occurring characters
| Value | Count | Frequency (%) | |
| N | 29849 | 17.9% | |
| A | 25588 | 15.4% | |
| E | 16742 | 10.1% | |
| (space) | 14084 | 8.5% | |
| O | 11794 | 7.1% | |
| C | 11794 | 7.1% | |
| < | 9136 | 5.5% | |
| 1 | 9136 | 5.5% | |
| H | 9136 | 5.5% | |
| I | 6556 | 3.9% | |
| L | 6556 | 3.9% | |
| D | 6556 | 3.9% | |
| R | 4948 | 3.0% | |
| B | 2290 | 1.4% | |
| Y | 2290 | 1.4% | |
| S | 5 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Uppercase Letter | 134104 | 80.6% | |
| Space Separator | 14084 | 8.5% | |
| Math Symbol | 9136 | 5.5% | |
| Decimal Number | 9136 | 5.5% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| N | 29849 | 22.3% | |
| A | 25588 | 19.1% | |
| E | 16742 | 12.5% | |
| O | 11794 | 8.8% | |
| C | 11794 | 8.8% | |
| H | 9136 | 6.8% | |
| I | 6556 | 4.9% | |
| L | 6556 | 4.9% | |
| D | 6556 | 4.9% | |
| R | 4948 | 3.7% | |
| B | 2290 | 1.7% | |
| Y | 2290 | 1.7% | |
| S | 5 | < 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 14084 | 100.0% | |
Most frequent Math Symbol characters
| Value | Count | Frequency (%) | |
| < | 9136 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 1 | 9136 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 134104 | 80.6% | |
| Common | 32356 | 19.4% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| N | 29849 | 22.3% | |
| A | 25588 | 19.1% | |
| E | 16742 | 12.5% | |
| O | 11794 | 8.8% | |
| C | 11794 | 8.8% | |
| H | 9136 | 6.8% | |
| I | 6556 | 4.9% | |
| L | 6556 | 4.9% | |
| D | 6556 | 4.9% | |
| R | 4948 | 3.7% | |
| B | 2290 | 1.7% | |
| Y | 2290 | 1.7% | |
| S | 5 | < 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 14084 | 43.5% | |
| < | 9136 | 28.2% | |
| 1 | 9136 | 28.2% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 166460 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| N | 29849 | 17.9% | |
| A | 25588 | 15.4% | |
| E | 16742 | 10.1% | |
| (space) | 14084 | 8.5% | |
| O | 11794 | 7.1% | |
| C | 11794 | 7.1% | |
| < | 9136 | 5.5% | |
| 1 | 9136 | 5.5% | |
| H | 9136 | 5.5% | |
| I | 6556 | 3.9% | |
| L | 6556 | 3.9% | |
| D | 6556 | 3.9% | |
| R | 4948 | 3.0% | |
| B | 2290 | 1.4% | |
| Y | 2290 | 1.4% | |
| S | 5 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
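The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs. The function name `pearson_r` and the sample data are assumptions made for the example, and inputs with zero variance are not handled.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so sums of products of deviations are sufficient.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
r = pearson_r(x, y)

# Shifting and rescaling either variable by a positive factor leaves r unchanged.
r_shifted = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_shifted)
```

Rescaling by a negative factor would flip the sign of r while keeping its magnitude.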
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
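As a rough illustration of the rank-based calculation described above, the sketch below ranks each variable and then applies the Pearson formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) are hypothetical, and giving tied values the average of their positions is an assumption of this example (a common convention), not something stated in the report.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance over the product of standard deviations (1/n factors cancel)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear
print(spearman_rho(x, y), pearson_r(x, y))
```

On this monotonic but nonlinear sample, the rank-based coefficient reaches its maximum while Pearson's r on the raw values stays below it.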
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
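The pair-counting definition above can be written out directly. The sketch below implements that plain definition (often called tau-a). The function name `kendall_tau` and the sample data are assumptions for illustration. Library implementations often use a tie-adjusted variant, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1  # both variables order the pair the same way
        elif product < 0:
            discordant += 1  # the variables order the pair in opposite ways
        # product == 0 means a tie, counted as neither
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]
tau = kendall_tau(x, y)
print(tau)
```

This version compares every pair of observations, so its cost grows quadratically with the number of rows. That is acceptable for a small illustration but slow for large datasets.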
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.
First rows
| longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value | ocean_proximity | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -122.23 | 37.88 | 41.0 | 880.0 | 129.0 | 322.0 | 126.0 | 8.3252 | 452600.0 | NEAR BAY |
| 1 | -122.22 | 37.86 | 21.0 | 7099.0 | 1106.0 | 2401.0 | 1138.0 | 8.3014 | 358500.0 | NEAR BAY |
| 2 | -122.24 | 37.85 | 52.0 | 1467.0 | 190.0 | 496.0 | 177.0 | 7.2574 | 352100.0 | NEAR BAY |
| 3 | -122.25 | 37.85 | 52.0 | 1274.0 | 235.0 | 558.0 | 219.0 | 5.6431 | 341300.0 | NEAR BAY |
| 4 | -122.25 | 37.85 | 52.0 | 1627.0 | 280.0 | 565.0 | 259.0 | 3.8462 | 342200.0 | NEAR BAY |
| 5 | -122.25 | 37.85 | 52.0 | 919.0 | 213.0 | 413.0 | 193.0 | 4.0368 | 269700.0 | NEAR BAY |
| 6 | -122.25 | 37.84 | 52.0 | 2535.0 | 489.0 | 1094.0 | 514.0 | 3.6591 | 299200.0 | NEAR BAY |
| 7 | -122.25 | 37.84 | 52.0 | 3104.0 | 687.0 | 1157.0 | 647.0 | 3.1200 | 241400.0 | NEAR BAY |
| 8 | -122.26 | 37.84 | 42.0 | 2555.0 | 665.0 | 1206.0 | 595.0 | 2.0804 | 226700.0 | NEAR BAY |
| 9 | -122.25 | 37.84 | 52.0 | 3549.0 | 707.0 | 1551.0 | 714.0 | 3.6912 | 261100.0 | NEAR BAY |
Last rows
| longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value | ocean_proximity | |
|---|---|---|---|---|---|---|---|---|---|---|
| 20630 | -121.32 | 39.29 | 11.0 | 2640.0 | 505.0 | 1257.0 | 445.0 | 3.5673 | 112000.0 | INLAND |
| 20631 | -121.40 | 39.33 | 15.0 | 2655.0 | 493.0 | 1200.0 | 432.0 | 3.5179 | 107200.0 | INLAND |
| 20632 | -121.45 | 39.26 | 15.0 | 2319.0 | 416.0 | 1047.0 | 385.0 | 3.1250 | 115600.0 | INLAND |
| 20633 | -121.53 | 39.19 | 27.0 | 2080.0 | 412.0 | 1082.0 | 382.0 | 2.5495 | 98300.0 | INLAND |
| 20634 | -121.56 | 39.27 | 28.0 | 2332.0 | 395.0 | 1041.0 | 344.0 | 3.7125 | 116800.0 | INLAND |
| 20635 | -121.09 | 39.48 | 25.0 | 1665.0 | 374.0 | 845.0 | 330.0 | 1.5603 | 78100.0 | INLAND |
| 20636 | -121.21 | 39.49 | 18.0 | 697.0 | 150.0 | 356.0 | 114.0 | 2.5568 | 77100.0 | INLAND |
| 20637 | -121.22 | 39.43 | 17.0 | 2254.0 | 485.0 | 1007.0 | 433.0 | 1.7000 | 92300.0 | INLAND |
| 20638 | -121.32 | 39.43 | 18.0 | 1860.0 | 409.0 | 741.0 | 349.0 | 1.8672 | 84700.0 | INLAND |
| 20639 | -121.24 | 39.37 | 16.0 | 2785.0 | 616.0 | 1387.0 | 530.0 | 2.3886 | 89400.0 | INLAND |